GPU-Accelerated BWT Construction for Large Collection of Short Reads
Advances in DNA sequencing technology have stimulated the development of
algorithms and tools for processing very large collections of short strings
(reads). Short-read alignment and assembly are among the most well-studied
problems. Many state-of-the-art aligners use, at their core, the
Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome
(a typical example being the NCBI human genome). Recently, the BWT has also
found use in string-graph assembly, for indexing the reads themselves (i.e.,
the raw data from DNA sequencers). In a typical data set, the volume of reads
is tens of times the size of the sequenced genome and can reach 100 gigabases.
Note that a reference genome is relatively stable, so computing its index is
an infrequent task; for reads, however, the index has to be computed from
scratch for each given input, and efficient BWT construction becomes a much
bigger concern than before. In this paper, we present a practical method
called CX1 for constructing the BWT of very large string collections. CX1 is
the first tool that can exploit the parallelism of a graphics processing unit
(GPU, a relatively cheap device providing a thousand or more primitive cores)
simultaneously with the parallelism of a multi-core CPU and, more
interestingly, of a cluster of GPU-enabled nodes. Using CX1, the BWT of a
short-read collection of up to 100 gigabases can be constructed in less than
2 hours on a machine equipped with a quad-core CPU and a GPU, or in about 43
minutes on a cluster of 4 such machines (the speedup is almost linear after
excluding the first 16 minutes spent loading the reads from the hard disk).
The previously fastest tool, BRC, takes a measured 12 hours to process 100
gigabases on one machine, and it is non-trivial to parallelize BRC across a
cluster of machines, let alone GPUs. Comment: 11 pages
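The transform CX1 computes can be illustrated with a tiny, deliberately naive sketch (per-read sentinel characters are assumed; this rotation-sorting approach is for exposition only and is nothing like CX1's GPU-parallel construction):

```python
# Naive BWT of a read collection: each read gets its own sentinel so
# suffixes from different reads never compare equal. This O(n^2 log n)
# sketch only illustrates the output being computed, not how CX1
# computes it in parallel.
def bwt_of_collection(reads):
    # Concatenate reads, terminating each with a unique sentinel that
    # sorts before 'A' (here: '#', '$', '%', ... by read index).
    text = "".join(r + chr(35 + i) for i, r in enumerate(reads))
    n = len(text)
    # Sort all cyclic rotations of the concatenated text.
    rotations = sorted(range(n), key=lambda i: text[i:] + text[:i])
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(text[(i - 1) % n] for i in rotations)

bwt = bwt_of_collection(["banana"])   # single read, '#' as its sentinel
```

For the classic single-string example, the result is the familiar "annb#aa" permutation; a real tool replaces the rotation sort with suffix-array or radix-sort machinery so the construction scales to gigabases.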
MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
MEGAHIT is an NGS de novo assembler for assembling large and complex
metagenomics data in a time- and cost-efficient manner. It finished assembling
a soil metagenomics dataset of 252 Gbp in 44.1 hours and 99.6 hours on a
single computing node with and without a GPU, respectively. MEGAHIT assembles
the data as a whole, i.e., it avoids pre-processing such as partitioning and
normalization, which might compromise result integrity. MEGAHIT generates an
assembly 3 times larger, with longer contig N50 and average contig length,
than the previous assembly. 55.8% of the reads were aligned to the assembly,
4 times higher than for the previous one. The source code of MEGAHIT is freely
available at https://github.com/voutcn/megahit under the GPLv3 license.
Comment: 2 pages, 2 tables, 1 figure, submitted to Oxford Bioinformatics as an
Application Note
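As a rough illustration of the underlying data structure (not MEGAHIT's succinct encoding), a de Bruijn graph over reads can be sketched with plain dictionaries; the reads and k value below are made up:

```python
from collections import defaultdict

# Plain-dict de Bruijn graph from reads: nodes are (k-1)-mers, edges are
# k-mers. MEGAHIT stores this graph succinctly to fit metagenome-scale
# data in memory; this sketch shows only the graph being represented,
# not the succinct encoding.
def de_bruijn(reads, k):
    graph = defaultdict(set)   # (k-1)-mer -> set of successor (k-1)-mers
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

g = de_bruijn(["ACGTC", "GTCA"], k=3)   # e.g. edge AC -> CG from k-mer ACG
```

Contigs then correspond to unbranched paths in this graph; the succinct representation trades the hash table for a BWT-style index at a fraction of the memory.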
Hidden Trends in 90 Years of Harvard Business Review
In this paper, we demonstrate and discuss the results of mining the abstracts
of publications in Harvard Business Review between 1922 and 2012. Techniques
for computing n-grams, collocations, basic sentiment analysis, and
named-entity recognition were employed to uncover trends hidden in the
abstracts. We present findings about international relationships, sentiment
in HBR's abstracts, important international companies, influential
technological inventions, renowned researchers in management theories, and US
presidents via chronological analyses. Comment: 6 pages, 14 figures,
Proceedings of the 2012 International Conference on Technologies and
Applications of Artificial Intelligence
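The n-gram counting step described above can be sketched as follows (with a minimal, hypothetical tokenizer; real collocation extraction would also apply an association score such as PMI rather than raw frequency):

```python
from collections import Counter
import re

# Minimal bigram counting of the kind applied to the HBR abstracts:
# lowercase, tokenize on word characters, then count adjacent pairs.
def bigrams(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(tokens, tokens[1:]))

counts = bigrams("management theory and management practice")
```

On a 90-year corpus, plotting such counts per decade is what surfaces the chronological trends the paper reports.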
Explicit Visual Prompting for Universal Foreground Segmentations
Foreground segmentation is a fundamental problem in computer vision,
encompassing salient object detection, forgery detection, defocus blur
detection, shadow detection, and camouflaged object detection. Previous works
have typically relied on domain-specific solutions to address accuracy and
robustness issues in those applications. In this paper, we present a unified
framework for a number of foreground segmentation tasks without any
task-specific designs. We take inspiration from the widely used pre-training
and prompt-tuning protocols in NLP and propose a new visual prompting model,
named Explicit Visual Prompting (EVP). Unlike previous visual prompting,
which is typically a dataset-level implicit embedding, our key insight is to
focus the tunable parameters on the explicit visual content of each
individual image, i.e., the features from frozen patch embeddings and
high-frequency components. Our method freezes a pre-trained model and then
learns task-specific knowledge using a few extra parameters. Despite
introducing only a small number of tunable parameters, EVP achieves superior
performance to full fine-tuning and other parameter-efficient fine-tuning
methods. Experiments on fourteen datasets across five tasks show that the
proposed method outperforms other task-specific methods while being
considerably simpler. The proposed method also demonstrates scalability
across different architectures, pre-trained weights, and tasks. The code is
available at: https://github.com/NiFangBaAGe/Explicit-Visual-Prompt.
Comment: arXiv admin note: substantial text overlap with arXiv:2303.1088
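A minimal sketch of the kind of high-frequency component EVP is described as prompting on, assuming a simple FFT mask with a hypothetical cut-off parameter tau (EVP's actual extraction and adapter layers are not reproduced here):

```python
import numpy as np

# Keep only high frequencies: zero out the low-frequency centre of the
# image's shifted Fourier spectrum, then invert. tau controls the
# fraction of the spectrum treated as "low".
def high_freq(img, tau=0.25):
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    cy, cx = h // 2, w // 2
    ry, rx = int(h * tau / 2), int(w * tau / 2)
    f[cy - ry:cy + ry + 1, cx - rx:cx + rx + 1] = 0  # drop low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

hf = high_freq(np.ones((8, 8)))   # a constant image has no high frequencies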
AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition
Raw videos have been shown to contain considerable feature redundancy: in
many cases, only a portion of the frames already meets the requirements for
accurate recognition. In this paper, we ask whether such redundancy can be
effectively leveraged to facilitate efficient inference in continuous sign
language recognition (CSLR). We propose a novel adaptive model (AdaBrowse)
that dynamically selects the most informative subsequence from an input video
by modelling this problem as a sequential decision task. Specifically, we
first use a lightweight network to quickly scan input videos and extract
coarse features. These features are then fed into a policy network that
selects a subsequence to process. The corresponding subsequence is finally
inferred by a normal CSLR model for sentence prediction. As only a portion of
the frames is processed in this procedure, total computation is considerably
reduced. Besides temporal redundancy, we also investigate whether the
inherent spatial redundancy can be seamlessly exploited for further
efficiency, i.e., dynamically selecting the lowest input resolution for each
sample; the resulting model is referred to as AdaBrowse+. Extensive
experimental results on four large-scale CSLR datasets, i.e., PHOENIX14,
PHOENIX14-T, CSL-Daily, and CSL, demonstrate the effectiveness of AdaBrowse
and AdaBrowse+, which achieve accuracy comparable to state-of-the-art methods
with 1.44× higher throughput and 2.12× fewer FLOPs. Comparisons with other
commonly used 2D CNNs and adaptive efficient methods further verify the
effectiveness of AdaBrowse. Code is available at
https://github.com/hulianyuyy/AdaBrowse. Comment: ACMMM202
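The scan-then-select pipeline can be caricatured as follows; the per-frame scores stand in for the lightweight network's coarse features, and the contiguous-window policy is a simplification of the learned policy network:

```python
# Toy version of the decision pipeline: a cheap pass scores every frame,
# a "policy" keeps the highest-scoring contiguous window, and only that
# window would be passed to the full CSLR model.
def select_subsequence(scores, window):
    # Contiguous window of the given length with the largest total score.
    best_start = max(range(len(scores) - window + 1),
                     key=lambda s: sum(scores[s:s + window]))
    return best_start, best_start + window

scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1]   # e.g. motion energy per frame
start, end = select_subsequence(scores, window=3)
```

The saving comes from the full model seeing `end - start` frames instead of all of them; AdaBrowse+ additionally lets the policy pick the input resolution per sample.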
Graph based Label Enhancement for Multi-instance Multi-label learning
Multi-instance multi-label (MIML) learning is widely applied in numerous
domains, such as image classification, where one image contains multiple
instances correlated with multiple logical labels simultaneously. The labels
in existing MIML are all assumed to be logical labels of equal significance.
In practical MIML applications, however, the significance of each label for
the multiple instances in a bag (such as an image) differs considerably.
Ignoring label significance discards much of the semantic information of the
object, leaving MIML poorly suited to complex scenes and degrading learning
performance. To this end, this paper proposes a novel MIML framework based on
graph label enhancement, namely GLEMIML, to improve the classification
performance of MIML by leveraging label significance. GLEMIML first
recognizes the correlations among instances by constructing a graph, then
migrates the implicit information mined from the feature space to the label
space via a nonlinear mapping, thus recovering label significance. Finally,
GLEMIML is trained on the enhanced data through matching and interaction
mechanisms. GLEMIML (AvgRank: 1.44) effectively improves the performance of
MIML by mining the label distribution and shows better results than the SOTA
method (AvgRank: 2.92) on multiple benchmark datasets. Comment: 7 pages, 2
figures
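One minimal, generic form of graph-based label enhancement (not GLEMIML's exact nonlinear mapping) propagates logical labels over a feature-similarity graph, so each 0/1 label acquires a graded significance reflecting the feature space:

```python
import numpy as np

# Build a Gaussian similarity graph over instances, row-normalise it,
# and propagate the logical label matrix Y through it once. The sigma
# bandwidth is a hypothetical parameter of this sketch.
def enhance_labels(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    W = np.exp(-d2 / (2 * sigma ** 2))                   # similarities
    P = W / W.sum(axis=1, keepdims=True)                 # row-stochastic
    return P @ Y                                         # graded labels

X = np.array([[0.0], [0.1], [5.0]])                  # two near, one far
Y = np.array([[1, 0], [1, 0], [0, 1]], dtype=float)  # logical labels
Z = enhance_labels(X, Y)
```

Instances with similar features end up with similar graded label vectors, which is the "label significance" that the downstream MIML classifier is then trained on.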
Lymph Node Metastasis in Differentiated Thyroid Cancers
Lymph node metastasis is common in differentiated thyroid cancers. Therapeutic neck dissection removes macroscopic nodal metastasis, reduces local recurrence, and facilitates cancer surveillance. On the other hand, microscopic nodal metastasis is increasingly recognized as a potential cause of persistent disease or early recurrence. Prophylactic neck dissection, by removing microscopic nodal metastasis, has been proposed to reduce recurrence and prevent future reoperation. When cancer recurs, regional nodal recurrence is most common, and management should be individualized. We hereby present a narrative review on the management of nodal metastasis in differentiated thyroid cancers.
D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization
Recently, many detection methods based on convolutional neural networks
(CNNs) have been proposed for image splicing forgery detection. Most of these
methods focus on local patches or local objects. In fact, image splicing
forgery detection is a global binary classification task that distinguishes
tampered from non-tampered regions by image fingerprints. However, certain
image contents that would improve detection accuracy are hardly retained by
CNN-based detection networks. To resolve these issues, we propose a novel
network called dual-encoder U-Net (D-Unet) for image splicing forgery
detection, which employs an unfixed encoder and a fixed encoder. The unfixed
encoder autonomously learns the image fingerprints that differentiate between
tampered and non-tampered regions, whereas the fixed encoder intentionally
provides direction information that assists the learning and detection of the
network. The dual encoder is followed by a spatial pyramid global-feature
extraction module that expands the global insight of D-Unet for classifying
tampered and non-tampered regions more accurately. In an experimental
comparison with state-of-the-art methods, D-Unet outperformed the other
methods in image-level and pixel-level detection, without requiring
pre-training or training on a large number of forgery images. Moreover, it
was stably robust to different attacks. Comment: 13 pages, 13 figures